Fast optimal load balancing algorithms for 1D partitioning
نویسندگان
چکیده
One-dimensional decompositionof nonuniformworkload arrays for optimal load balancing is investigated. The problemhas been studied in the literature as "chains-on-chains partitioning" problem. Despite extensive research efforts, heuristics are still used in parallel computing community with the "hope" of good decompositions and the "myth" of exact algorithms being hard to implement and not runtime efficient. The main objective of this paper is to show that using exact algorithms instead of heuristics yields significant load balance improvements with negligible increase in preprocessing time. We provide detailed pseudocodes of our algorithms so that our results can be easily reproduced. We start with a review of literature on chains-on-chains partitioning problem. We propose improvements on these algorithms as well as efficient implementation tips. We also introduce novel algorithms, which are asymptotically and runtime efficient. We experimented with datasets from two different applications: Sparse matrix computations and Direct volume rendering. Experiments showed that the proposed algorithms are 100 times faster than a single sparse-matrix vector multiplication for 64-way decompositions on average. Experiments also verify that load balance can be significantly improved by using exact algorithms instead of heuristics. These two findings show that exact algorithms with efficient implementations discussed in this paper can effectively replace heuristics.
منابع مشابه
Sparse matrix decomposition with optimal load balancing
Optimal load balancing in sparse matrix decomposition without disturbing the row/column ordering is investigated. Both asymptotically and run-time efficient exact algorithms are proposed and implemented for one-dimensional (1D) striping and two-dimensional (2D) jagged partitioning. Binary search method is successfully adopted to 1D striped decomposition by deriving and exploiting a good upper b...
متن کاملHighly scalable SFC-based dynamic load balancing and its application to atmospheric modeling
Load balance is one of the major challenges for efficient supercomputing, especially for applications that exhibit workload variations. Various dynamic load balancing and workload partitioning methods have been developed to handle this issue by migrating workload between nodes periodically during the runtime. However, on today’s top HPC systems – and even more so on future exascale systems – ru...
متن کاملApplying graph partitioning methods in measurement-based dynamic load balancing
Load imbalance leads to an increasing waste of resources as an application is scaled to more and more processors. Achieving the best parallel efficiency for a program requires optimal load balancing which is a NP-hard problem. However, finding near-optimal solutions to this problem for complex computational science and engineering applications is becoming increasingly important. Charm++, a migr...
متن کاملAn Optimal Balanced Partitioning of a Set of 1D Intervals
Given a set of 1D intervals and a desired partition number, this paper studies on how to make an optimal partitioning of these intervals, such that the number of intervals between the largest partition and smallest partition is minimal among all possible partitioning schemes. Though seemingly easy at the first glance, this problem has its difficulty due to the fact that an interval ``striding''...
متن کاملImage-Space Decomposition Algorithms for Sort-First Parallel Volume Rendering of Unstructured Grids
Twelve adaptive image-space decomposition algorithms are presented for sort-first parallel direct volume rendering (DVR) of unstructured grids on distributed-memory architectures. The algorithms are presented under a novel taxonomy based on the dimension of the screen decomposition, the dimension of the workload arrays used in the decomposition, and the scheme used for workload-array creation a...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- J. Parallel Distrib. Comput.
دوره 64 شماره
صفحات -
تاریخ انتشار 2004